Xin Tao

Stable Velocity: A Variance Perspective on Flow Matching

Feb 05, 2026

VMonarch: Efficient Video Diffusion Transformers with Structured Attention

Jan 29, 2026

SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer

Jan 23, 2026

CamPilot: Improving Camera Control in Video Diffusion Model with Efficient Camera Reward Feedback

Jan 22, 2026

A Mechanistic View on Video Generation as World Models: State and Dynamics

Jan 22, 2026

StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Dec 18, 2025

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Dec 18, 2025

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Dec 16, 2025

Astra: General Interactive World Model with Autoregressive Denoising

Dec 15, 2025

UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Dec 08, 2025